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ABSTRACT 

In this paper, we address the problem of recovering images 
degraded by Poisson noise, where the image is known to be- 
long to a specific class. In the proposed method, a dataset 
of clean patches from images of the class of interest is clus- 
tered using multivariate Gaussian distributions. In order to 
recover the noisy image, each noisy patch is assigned to one 
of these distributions, and the corresponding minimum mean 
squared error (MMSE) estimate is obtained. We propose to 
use a self-normalized importance sampling approach, which 
is a method of the Monte-Carlo family, for the both determin- 
ing the most likely distribution and approximating the MMSE 
estimate of the clean patch. Experimental results shows that 
our proposed method outperforms other methods for Poisson 
denoising at a low SNR regime. 

Index Terms — Image denoising, Poisson noise, class- 
specific dataset, importance sampling. 

1. INTRODUCTION 

Recovering noisy images is a fundamental problem in im- 
age processing and computer vision. In many image denois- 
ing applications, especially in low intensity (photon-limited) 
images, the noise follows a Poisson distribution. Photon- 
limited acquisition scenarios are often found in astronomical 
and medical imaging, and in many other areas. 

The classical formulation for Poisson denoising consid- 
ers the task of recovering an underlying image, the pixels of 
which are stacked in a vector x £ R+, from Poissonian ob- 
servations y £ Nq , i.e. 
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A common approach to Poisson densoising proceeds as 
follows: first, a variance stabilization transform (VST) is 
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applied ( e.g ., the Anscombe transform [1]), which approxi- 
mately turns the Poisson distribution into a Gaussian of unit 
variance. Then, one of the many existing methods for Gaus- 
sian denoising is applied to the transformed image. Finally an 
inverse transformation yields the image estimate [2, 3], How- 
ever, this approach is inaccurate for low intensity (low SNR) 
images, thus many methods denoise the Poissonian images 
without any transformation [4], Our method belongs to this 
second category. 

In many applications, the image to be recovered is known 
to belong to a certain class, such as text, face, or fingerprints. 
Knowing this information can help to learn a statistical prior 
that is better adapted to that specific class than a generic prior, 
and this prior can be learned from a dataset of clean images 
from that class. Recently this class-specific approach has 
been addressed in [5, 6], for denoising and deblurring under 
Gaussian noise. 

Some patch-based methods for recovering images de- 
graded by Gaussian noises use multivariate priors for image 
restoration to obtain maximum a posteriori (MAP) or min- 
imum mean squared error (MMSE) estimates of the clean 
image. Most of these methods fit a mixture multivariate 
Gaussian distributions to patches, either from an external 
dataset [7], or from the noisy image itself [8, 9]. Although for 
Poisson denoising, patch-based methods have been proposed 
[4, 10], there is a lack of methods that use an explicit prior 
on image patches, due to the non-Gaussian forward model, 
which makes it difficult to compute the corresponding MAP 
or MMSE estimates. On the other hand, the fitted Gaussian 
mixture prior is an approximation of the true distribution that 
characterizes the image patches. 

In this paper, we propose a class-adapted method to re- 
cover images degraded by Poisson noise using a dataset of 
clean images of the same class. In our method, a mixture 
of multivariate Gaussian distribution is fitted to the external 
dataset of clean image patches. For each noisy patch of the 
observed image, we use the self-normalized importance sam- 
pling [11], which is an approach from the family of Monte- 
Carlo methods, to both choose the proper prior distribution 
and approximate the MMSE estimation of each patch. Using 
importance sampling allows us to approximate the MMSE es- 
timate using the true distribution of image patches, if the sam- 



pies are the patches of clean images in the dataset. Further- 
more, our proposed method can be extended to other forward 
models with known noise distribution, and different prior den- 
sities of clean patches. 

In the following sections, we first briefly review the self- 
normalizing importance sampling approach. Then, we de- 
scribe the proposed method for MMSE Poisson image denot- 
ing based on self-normalizing importance sampling. Section 
4 reports experimental results. 

2. SELF-NORMALIZED IMPORTANCE SAMPLING 

Our approach uses a technique from the family of Monte- 
Carlo methods, called self -normalized importance sampling 
(SNIS) [11, 12], which we will now briefly review. Assume 
that the objective is to compute (or approximate) E[/(X)], the 
expectation of a function / : of a random variable 

X £ M . Denoting the support of the random variable X as 
1Z C the expectation is given by 

E I/( X )] = [ /(x)p(x)dx. (2) 

Jn 

Consider also that only an un-normalized version p(x) = 
cp{x) of the probability density function p(x) of X is known, 
where c is unknown. Let q(x) = dq{x) be another un- 
normalized density, where the normalizing constant d is also 
unknown, but from which samples can be more efficiently ob- 
tained than from p(x). 

SNIS produces an estimate of E[/(X)] given by 

n 

^/(XjMXj) 

En[/(x)] - , (3) 
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where w(xj) = |^y, and xi,...,x n are n independent 
and identically distributed samples from distribution g(x) 
[11, 12], This approximation can be shown to converges to 
the true value of E[/(x)] as n goes to infinity. 

3. PROPOSED METHOD 
3.1. Prior Learning 

The first step of the proposed method is to fit a set of K 
multivariate Gaussians to the patches in the external dataset 
of clean images. We adopt the so-called classification EM 
(CEM) algorithm [13], rather than a standard EM algorithm 
as in [8, 6, 7], for reasons explained below. The CEM algo- 
rithm works by alternating between the following two steps 
(after being initialized by standard A'-means clustering): 


1. From the set of patches assigned to each cluster k £ 
{1, ..., A'}, denoted X/ c , obtain estimates of the mean 
Pk and covariance matrix of the corresponding 
Gaussian density, which are simply the sample mean 
and the sample covariance of the patches in X;,.. 

2. Assign each patch to the cluster under which it has the 
highest likelihood, that is, patch x ; is assigned to X/. if 

k = arg maxA/’fx, ; p m , £ m ), 

m 

where, as usual, A /"(x; p, £) denotes a Gaussian density 
of mean p and covariance £, computed at x. 

In the denoising step, each noisy patch will be assigned to one 
of these clusters. The reason for this kind of clustering is that 
in the simple importance sampling approach, for a fixed num- 
ber of samples n, the MSE of the estimator is proportional 
to the variance of samples being averaged [12], Similarly in 
the multivariate case, for a fixed n, it has been shown that the 
MSE of the estimator of a particular entry in the vector de- 
creases as the variance of the samples of that entry, given the 
noisy patch, decreases [14], Consequently, since in practice 
we use a limited number of patches from the external dataset, 
by clustering this way we expect to reduce the estimator vari- 
ance, without increasing too much the number of samples. 

It should be noted that the above procedure needs to be 
applied once for a given dataset of class-specific images. 

3.2. Image denoising 

In the denoising step, each patch is assigned to one of the clus- 
ters obtained in the learning stage. However, we only have 
noisy patches, thus the assignment is not trivial. If the noise 
was Gaussian, the assignments could be made in closed-form, 
but this is not the case with Poisson observations. Our main 
contribution is a new method, based on SNIS, to simultane- 
ously determine the cluster and estimate the clean patch. 

Define the random variable ki £ { 1 , . . . . K } , which in- 
dicates the cluster to which the i-th patch belongs, and de- 
note the corresponding distribution as p(x, |fc,). Having a set 
of learned cluster distributions, our objective is to solve the 
following simultaneous classification and MMSE estimation 
problem, given a noisy patch y,: 

(*i, k) = argmin [ ||u - x||§ p(x|y 4 , k) dx, (4) 

( u,k ) J 

where m is the number of pixels in each patch. In other words, 
we seek the estimate and the cluster that yield the minimum 
MSE. As shown below, we solve the above problem by the al- 
ternating minimization approach, but first we need to address 
the problem of how to approximate this integral. 

First, using Bayes rule and the fact that p(yj|x, k) = 
p(y;|x) ( i.e ., given the clean patch, the noisy one does not 



depend on the cluster), the integral in (4) can be written as 

m 2 p(y*|x)p(x|fc) 


E 0l x - u ll2|yi^] = 


p(yi\k) 


dx. (5) 


Then, using SNIS, the above integral can be approximated by 

n 

H u — x j II2 p(yi\ x j) 

E n[||x- u||||yi, k\ = — (6) 

^2p(yi\^j) 

i= 1 


where the x ; , for j = are samples from the dis- 

tribution p(x\k). Exploiting the self-normalized impor- 
tance sampling formula (3), in (6) we considered p(x) = 
p(y.;|x)p(x|fc), with the unknown constant c = l/p(yi\k) 
and g(x) = p(x\k). 

The minimization with respect to k in (4), while u is fixed 
to be x,, can then be approximated as 

h = argminE n2 [||x- Xi\\l\yi,k\. (7) 

k 


Note that for computing (7), «2 samples from each distribu- 
tion p(x\k) are randomly extracted. 

The next step towards an alternating minimization method 
is a way to minimize E„[||x — uj|||y,;, k] with respect to u, 
for a given k = hi. This is the well-known MMSE esti- 
mation criterion, which is given by the posterior expectation 
E(x|yi,fc,). Computing this expectation cannot be done in 
closed-form, under the Poisson observation model and Gaus- 
sian prior p{x\k) = A /”(x; Efc)- To tackle this difficulty, 
we resort again to SNIS, 

ni 

5 > 4 p(y*l x f) 

x,: = E ni [x|y», kj] = ^ (8) 

^2p(yi\ x j) 
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where Xj , x n] are samples drawn from the distribution 
p(x\k). The approximation in (8) has been used before for 
patch-based denoising [15], but without noticing its connec- 
tion to SNIS. 

Since the multivariate Gaussian distributions fitted to the 
set of patches is only a crude approximation of the true dis- 
tribution of the patches, samples from p(x\k) may yield poor 
results in the SNIS approximation. Instead of sampling from 
p{x.\k), we directly sample form the set patches in the dataset 
assigned to that fc-th cluster, X/., which turns out to yield bet- 
ter results. 

We initialize our algorithm by u = y,. Our algorithm 
then alternates between (8) and (7), to assign each patch to 
a cluster and obtain an MMSE estimate of the patch. The 
procedure of iterative clustering and denoising has been used 


in some well-known patch-based denoising methods [16, 7]. 
The patches are returned to the original position in the image 
and are averaged in the overlapping pixels. 

Note that our method can be applied to any noise model 
with a known distribution. In this paper, we only consider 
Poissonian noise and we leave other possible noise distribu- 
tions for future work. For the Poisson case, the conditional 
distribution of a noisy patch y, given a clean patch x £ R"', 
is as shown in (1), with N = m. 


• Initialization: cluster the training patches into K 
clusters, X^ . . . X^\ via the fc-means algorithm 

• Main loop: for i = 1 

- Estimate the parameter of Gaussian distribu- 
tions 0^ = as the sample mean 

(i) 

and the sample covariance matrix of X], 

- For each patch x ; , assign it to a cluster, i. e . , 
put xj into X- , where 

kj 

kj = arg max Af(xj ; ) 

k 

• Output: {X^-.-X^} 


Fig. 1: Learning the prior from a class-specific dataset. 


4. PRACTICAL CONSIDERATIONS AND 
EXPERIMENTAL RESULTS 

The reported results in this section were obtained on the Gore 
dataset of face images [17] and on the dataset of text images 
used in [5]. For each dataset, 5 images are randomly chosen 
as test images and the rest are chosen for training. For the 
face images, we extracted 95 x 10 3 patches from the training 
data, whereas from the text dataset, 75 x 10 3 patches were 
extracted. 

The described method is computationally expensive, if it 
is applied to all patches in the dataset. However, the key to re- 
duce the computational complexity is to limit the number of 
patch samples used for estimating the clean patches and de- 
termining the cluster i.e. ni and n 2 . Our results in this section 
show that for a very limited number of samples, we obtain ac- 
ceptable results, which outperforms other Poisson denoising 
methods for the tested datasets. For determining the cluster, 
we set ?i 2 = 30 which is overall 600 patches for all k = 20 
clusters and is less than 1% of the samples in each external 
datasets. The number of samples, n 1 , used for denoising each 
patch was set to 300. So, overall for each noisy patch we 




Table 1: Denoising PSNR results (in dB) for different peak 
values of face images in the Gore face database [17]. The 
results are averaged over 5 test images. 
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NL-PCA [4] 

19.69 

22.87 

23.80 

25.01 

VST+BM3D [3] 

20.80 

23.15 

24.79 

25.41 

Poisson NL-means [18] 

21.12 

23.41 

24.73 

25.32 

P4IP [19] 

20.03 

23.78 

24.88 

25.84 

Our method 

21.31 

23.95 

25.78 

27.40 


processed 900 patches which is in computational complex- 
ity roughly similar to an internal non-local denoising with the 
patches constrained in 30 x 30 window. Unlike the original 
non-local means, which only the central pixel of each patch is 
denoised, in our method the whole patch is denoised by (8), 
the patches are then returned to the original position in the 
image and are averaged in the overlapping pixels. In order to 
further reduce the computational complexity, we extract the 
patches from the noisy image every 2 pixels along the row 
and the column of the image. 

In Table 1, the average result of PSNR of the 5 tested im- 
ages for the face dataset are compared for different methods. 
The results of our method result from two iterations of the 
alternating minimization approach. We found that increasing 
the number of iteration to more than two does not noticeably 
increase the quality of obtained image estimate, while it obvi- 
ously increases the computational cost. The reason may lie in 
the fact that the discrete variables fc,; computed in (7) stabilize 
after a couple of iterations. 

In Fig. (2), an example of denoising results for the face 
images with the peak value of 10 using the dataset in [17] 
is illustrated. It can be seen that our method also improves 
noticeably the visuals quality of image. 

An example of denoising text images is shown in Fig. (3). 
It can be seen that the general image denoising methods fail 
to reconstruct the true image properly. However our class- 
specific method outperforms by a relatively high margin. 


5. CONCLUSION 

We proposed a method for class-specific denoising of images 
degraded by Poisson noise. We proposed to use a method 
from the Monte-Carlo family, called self-normalized impor- 
tance sampling, in order to determine the prior and approxi- 
mate the MMSE estimation. The sampling approach allowed 
us to approximate the MMSE using the true underlying pri- 
ors rather than fitted multivariate Gaussian distributions. Our 
results showed that our method outperforms other methods in 
both PSNR and visual quality. 



Fig. 2: An example of denoising of a face image using the 
Gore dataset ; (a) Noisy image (Peak value=10); (b) Non- 
local PCA (PSNR=22.60); (c) VST+BM3D (PSNR=24.79); 
(d) Poisson non-local means (PSNR=24.55); (e) Proposed 
(First iteration) (PSNR=25.88); (f) Proposed (Second itera- 
tion) (PSNR=26.40). 
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Fig. 3: Comparison of denoising of general methods with our 
class-specific one for recovering the text images; (a) Origi- 
nal image (b) Noisy image (Peak value=2) (c) Non-local PCA 
(PSNR= 14.95) (d) VST+BM3D (PSNR= 14.55); (e) Proposed 
(First iteration) (PSNR=17.21); (f) Proposed (Second itera- 
tion) (PSNR= 18.64). 
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